12 research outputs found

    Stemming Algorithm in Searching Malay Text

    Stemming is one of the processes that can be used to improve the performance of a search engine. It reduces variant word forms to a common form. This project evaluates the retrieval effectiveness of a stemming algorithm in searching and retrieving relevant Malay Web pages based on users' natural-language query words. The retrieved Web pages are weighted and ranked using an inverse document frequency function. Retrieval effectiveness is measured using standard recall and precision. The experiments show that searching with stemming improves retrieval effectiveness compared to searching without stemming.
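
    As a rough illustration of the idea in this abstract, the sketch below stems both documents and query words and then ranks documents by summed inverse document frequency, idf(t) = log(N / df(t)). The affix lists, the `toy_stem` and `rank` functions, and the three sample documents are all hypothetical; this is not the stemming algorithm evaluated in the project, which would need a root-word dictionary and far more rules.

```python
import math

# Hypothetical, deliberately tiny affix lists (assumption, not from the source).
PREFIXES = ("mem", "men", "ber", "di")
SUFFIXES = ("kan", "an", "i")

def toy_stem(word):
    """Strip at most one prefix and one suffix, keeping at least 4 letters."""
    w = word.lower()
    for p in PREFIXES:
        if w.startswith(p) and len(w) - len(p) >= 4:
            w = w[len(p):]
            break
    for s in SUFFIXES:
        if w.endswith(s) and len(w) - len(s) >= 4:
            w = w[:-len(s)]
            break
    return w

def rank(query, docs, stem):
    """Score each document by the summed IDF of the query terms it contains."""
    index = {doc_id: {stem(t) for t in text.split()} for doc_id, text in docs.items()}
    n = len(index)
    scores = {}
    for term in {stem(t) for t in query.split()}:
        df = sum(term in terms for terms in index.values())
        if df == 0:
            continue  # term occurs nowhere, so it contributes nothing
        idf = math.log(n / df)
        for doc_id, terms in index.items():
            if term in terms:
                scores[doc_id] = scores.get(doc_id, 0.0) + idf
    # Highest score first; ties broken by document id for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical sample documents.
DOCS = {
    "d1": "saya membaca buku",
    "d2": "bacaan pagi",
    "d3": "dia makan nasi",
}

no_stem = rank("baca", DOCS, lambda w: w.lower())
with_stem = rank("baca", DOCS, toy_stem)
```

    In this toy setup the unstemmed query matches no document, while the stemmed query matches both `d1` and `d2`, which is the kind of recall gain the abstract reports.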

    A Hybrid of Ant Colony Optimization Algorithm and Simulated Annealing for Classification Rules

    Ant colony optimization (ACO) is a metaheuristic approach inspired by the behaviour of natural ants and can be used to solve a variety of combinatorial optimization problems. Classification rule induction is one of the problems solved by the Ant-miner algorithm, a variant of ACO, which was initiated by Parpinelli in 2001. Previous studies have shown that ACO is a promising machine learning technique for generating classification rules. However, the Ant-miner is less class focused, since a rule's class is assigned only after the rule has been constructed. There are also cases where the Ant-miner cannot find an optimal solution for some data sets. Thus, this thesis proposes two variants of hybrid ACO with a simulated annealing (SA) algorithm for solving the problem of classification rule induction. In the first proposed algorithm, SA is used to optimize the rule discovery activity of an ant. Benchmark data sets from various fields were used to test the proposed algorithms. Experimental results obtained from this first algorithm are comparable to those of the Ant-miner and other well-known rule induction algorithms in terms of rule accuracy, but are better in terms of rule simplicity. The second proposed algorithm uses SA to optimize term selection while constructing a rule. Because this algorithm fixes the class before each rule is constructed, a much simpler heuristic and fitness function is proposed. In experiments, the second algorithm achieves much higher predictive accuracy than the other compared algorithms. The successful hybridization of the ACO and SA algorithms has improved the learning ability of ACO for classification. Thus, a classification model with higher predictive power could be generated for various fields.
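
    For readers unfamiliar with how an ant builds a rule, the sketch below shows the usual ACO-style choice step: a term is picked with probability proportional to its pheromone level times a heuristic value. The term names, the pheromone and heuristic numbers, and the function names are hypothetical, and the thesis's own heuristic and fitness functions are not reproduced here.

```python
import random

def term_probabilities(pheromone, heuristic):
    """Normalise pheromone * heuristic into selection probabilities."""
    weights = {term: pheromone[term] * heuristic[term] for term in pheromone}
    total = sum(weights.values())
    return {term: w / total for term, w in weights.items()}

def choose_term(pheromone, heuristic, rng):
    """Roulette-wheel selection of one term using the probabilities above."""
    probs = term_probabilities(pheromone, heuristic)
    r = rng.random()
    cumulative = 0.0
    for term, p in probs.items():
        cumulative += p
        if r < cumulative:
            return term
    return term  # guards against floating-point rounding at the upper end

# Hypothetical (attribute, value) terms with made-up pheromone and heuristic values.
PHEROMONE = {("outlook", "sunny"): 1.0, ("outlook", "rainy"): 1.0}
HEURISTIC = {("outlook", "sunny"): 3.0, ("outlook", "rainy"): 1.0}

probs = term_probabilities(PHEROMONE, HEURISTIC)
picked = choose_term(PHEROMONE, HEURISTIC, random.Random(0))
```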

    Searching Malay text using stemming algorithm

    Stemming is an important process for improving the performance of a search engine by reducing variant word forms to a common form. This paper evaluates the retrieval effectiveness of a stemming algorithm in searching and retrieving relevant Malay web pages based on users' natural-language query words. The retrieved web pages are weighted and ranked using an inverse document frequency function. Retrieval effectiveness is measured using standard recall and precision. The experiments show that searching with stemming improves retrieval effectiveness compared to searching without stemming.
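
    The two evaluation measures named in this abstract can be computed as follows. Precision is the fraction of retrieved pages that are relevant, and recall is the fraction of relevant pages that were retrieved. The page identifiers are hypothetical and the function name is only illustrative.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical result set and relevance judgements for a single query.
precision, recall = precision_recall(
    retrieved=["p1", "p2", "p3", "p4"],
    relevant=["p1", "p3", "p5"],
)
```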

    Smartphone assisted legal writing / Zeti Zuryani Mohd Zakuan and Rizauddin Saian

    At Universiti Teknologi MARA, certain courses require students to take law papers as one of the requirements for completing the course. Examples are the Bachelor in Business Administration and the Bachelor of Accountancy. These non-law students may not have difficulty learning law, but they may struggle to answer law questions, because exam questions must be answered in the form of an essay. Legal writing starts when students are required to answer law questions. Answering law questions is a major problem for non-law students, as they do not know where to start or what to write. They perceive writing as a daunting task in law classes. To improve the process of writing answers to law questions, students need to know the format for answering them. A study was carried out to find a new way to help non-law students who take law papers to answer law questions properly. The study developed a smartphone application that guides students throughout the writing process. It is called SALeW (Smartphone Assisted Legal Writing). It is timely for universities in Malaysia to venture into mobile learning, as interest and innovation in using mobile technologies have increased among students in Malaysia. SALeW is useful to non-law students because it guides them through the steps required to answer law questions. The lecturer needs to show the students what is required in an answer, and the students can later practise writing on their own with the help of SALeW. To use SALeW, students install the application on their smartphones. They can then use it anywhere to practise answering law questions. SALeW promotes independent study, which is in line with the 2015-2025 Malaysian Education Blueprint – Higher Education (PPPM-PT), which aims, inter alia, to produce holistic and valuable graduates. Independent study is important for inculcating values among students, which can later be reinforced and incorporated in the working environment upon graduation. Furthermore, this study illustrates support for the blended learning programme implemented at Universiti Teknologi MARA.

    Ant colony optimization for rule induction with simulated annealing for terms selection

    This paper proposes a sequential covering based algorithm that uses an ant colony optimization algorithm to directly extract classification rules from the data set. The proposed algorithm uses a simulated annealing algorithm to optimize term selection while growing a rule. By optimizing term selection during rule construction, it reduces the problem of an ant in a colony discovering a rule that is not of the best quality. Seventeen data sets from the UCI repository, consisting of discrete and continuous data, are used to evaluate the performance of the proposed algorithm. Promising results are obtained when compared to the Ant-Miner and PART algorithms in terms of the average predictive accuracy of the discovered classification rules.
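
    To make "simulated annealing for term selection" more concrete, here is a minimal sketch under stated assumptions: the class is fixed in advance, a neighbouring rule is produced by adding, replacing or removing one term, and rule quality is sensitivity times specificity (the measure commonly associated with Ant-Miner). The toy data, cooling schedule and function names are hypothetical and are not taken from the paper.

```python
import math
import random

# Hypothetical toy data: each row is (attribute values, class label).
DATA = [
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "stay"),
    ({"outlook": "rainy", "windy": "no"}, "stay"),
    ({"outlook": "rainy", "windy": "yes"}, "stay"),
    ({"outlook": "overcast", "windy": "no"}, "play"),
    ({"outlook": "overcast", "windy": "yes"}, "play"),
]
TERMS = sorted({(a, v) for row, _ in DATA for a, v in row.items()})

def covers(rule, row):
    """A rule (attribute -> value) covers a row if every term matches."""
    return all(row.get(a) == v for a, v in rule.items())

def quality(rule, target, data=DATA):
    """Sensitivity * specificity of 'IF rule THEN target' on the data."""
    tp = fp = tn = fn = 0
    for row, label in data:
        hit = covers(rule, row)
        if hit and label == target:
            tp += 1
        elif hit:
            fp += 1
        elif label == target:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity * specificity

def neighbour(rule, rng):
    """Toggle one random term: remove it if present, else add or replace it."""
    attr, value = rng.choice(TERMS)
    new_rule = dict(rule)
    if new_rule.get(attr) == value:
        del new_rule[attr]
    else:
        new_rule[attr] = value
    return new_rule

def anneal_rule(target, rng, t0=1.0, alpha=0.9, steps=200):
    """Simulated annealing over rule antecedents; returns the best rule seen."""
    current = {}
    current_q = quality(current, target)
    best, best_q = dict(current), current_q
    temp = t0
    for _ in range(steps):
        candidate = neighbour(current, rng)
        candidate_q = quality(candidate, target)
        delta = candidate_q - current_q
        # Always accept improvements; accept worse rules with prob. exp(delta / T).
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_q = candidate, candidate_q
            if current_q > best_q:
                best, best_q = dict(current), current_q
        temp = max(temp * alpha, 1e-6)  # geometric cooling with a floor
    return best, best_q

best_rule, best_quality = anneal_rule("play", random.Random(42))
```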

    A new ant based rule extraction algorithm for web classification

    Attribute reduction and discretization are two important data pre-processing steps before data can be used for classification. Web documents contain an enormous number of attributes compared to other types of data. The Ant-Miner algorithm also still lacks efficiency, accuracy and rule simplicity because of the local minima problem. Therefore, the Ant-Miner algorithm needs to be improved with respect to accuracy and rule simplicity so that it can be used to classify Web document data sets or any large data sets. The best attribute selection method for Web text categorization is the combination of correlation-based evaluation with random search as the search method. However, this attribute selection method will not give the best performance in attribute reduction. Classifier-based attribute subset selection removes more attributes, but sacrifices the performance of the classifier. A hybrid ant colony optimization with simulated annealing algorithm to discover rules from data is proposed. The simulated annealing technique reduces the problem of an ant in a colony discovering a low quality rule. The best rule for a colony is then chosen, and later the best rule among the colonies is included in the rule set. The rule set is arranged in decreasing order of generation. Thirteen data sets consisting of discrete and continuous data were used to evaluate the performance of the proposed algorithm in terms of accuracy, number of rules and number of terms in the rules. Experimental results obtained from the proposed algorithm are comparable to those of the Ant-Miner algorithm in terms of rule accuracy but are better in terms of rule simplicity.

    Comparison of attribute selection methods for web texts categorization

    This paper presents a study on the performance of attribute selection methods to be used with the Ant-Miner algorithm for web text categorization. The new data set generated by each attribute selection method was classified with Ant-Miner to assess performance in terms of predictive accuracy and the number of rules generated. The classification results were also compared to those of the C4.5 algorithm.

    AFTA and Malaysia: simulation of trade creation (TC) and trade diversion (TD)

    This study evaluates the effect of tariff liberalization under AFTA and its impact on Malaysia-ASEAN trade. Focusing on the growth of Malaysia-ASEAN trade allows the relationship between tariff liberalization and the increase in this bilateral trade to be assessed. A panel data estimation method and a panel import demand model are used. Three-digit SITC data are used to obtain a true picture of the effects on the sectors studied. Tariff reduction is found to have a significant trade creation effect in the textile products sector, whereas the trade diversion effect is significant in the palm oil sector.

    An enhancement of sliding window algorithm for rainfall forecasting

    Researchers have presented various rainfall forecasting models and techniques in order to obtain the best forecasting results. Despite the variety of techniques and methods, not all of them produce satisfactory rainfall forecasts. Therefore, this study proposes a rainfall forecasting method based on the sliding window algorithm (SWA) in order to obtain the best rainfall prediction values. The problem addressed in this study is the unsatisfactory accuracy of the rainfall output in a previous study of SWA. Hence, SWA is enhanced in order to produce highly accurate rainfall predictions. The proposed method is tested using data from three different rainfall gauge stations, taken from the Department of Irrigation and Drainage (DID), Perlis, Malaysia. The rainfall forecasting results are then validated using mean square error (MSE) and relative geometric root mean squared error (relative GRMSE). The validation analysis shows that the proposed method has higher forecasting accuracy than the previous sliding window algorithm method.
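
    The general mechanics of a sliding window forecast and its MSE validation can be sketched as below. This is a generic illustration only: the window-mean predictor, the window width and the rainfall values are hypothetical stand-ins, and neither the original SWA nor the enhancement proposed in the study is reproduced.

```python
def sliding_windows(series, width):
    """Yield (window, next_value) pairs from consecutive positions in the series."""
    for i in range(len(series) - width):
        yield series[i:i + width], series[i + width]

def forecast_mean(window):
    """Naive stand-in predictor: the mean of the window."""
    return sum(window) / len(window)

def mse(actual, predicted):
    """Mean square error between two equally long sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical rainfall readings (mm), not data from the study.
rain = [10.0, 12.0, 14.0, 16.0, 18.0]
pairs = list(sliding_windows(rain, width=3))
actual = [target for _, target in pairs]
predicted = [forecast_mean(window) for window, _ in pairs]
error = mse(actual, predicted)
```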

    Determining Intermediary Closely Related Languages to Find a Mediator for Intertribal Conflict Resolution

    Indonesia has a diverse ethnic and cultural background. However, this diversity sometimes creates social problems, such as intertribal conflict. Because of the large differences among tribal languages, it is often difficult for conflicting parties to engage in dialogue for conflict resolution. To address this problem, we aim to find intermediary closely related languages from a language similarity knowledge graph using the best-performing pathfinding algorithms. In this research, we analyze the performance of two pathfinding algorithms, namely Dijkstra and Yen's K, by comparing their execution time and the total lexical distance of the intermediary languages (called "the cost"). Our findings show that even though the Dijkstra and Yen's K algorithms have equal total cost in all cases, Yen's K outperformed Dijkstra at searching for closely related intermediary languages, with an average of 160% higher performance on execution time. The selection of native speakers of the obtained intermediary languages as mediators is formalized as an optimization problem with four criteria: language similarity, geographical distance, background, and expected salary. We present a case study in which the intermediary closely related languages can be used as a guideline to find mediators who can help resolve intertribal conflicts among Indonesian tribes. To calculate the first criterion, we implemented the Yen's K algorithm to calculate the shortest path between target languages and return the path via the intermediary languages. This implementation shows the potential use of the mediator selection model defined in this paper in various other roles, such as trader or salesman, politician's spokesman, reporter or journalist, etc.
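
    As an illustration of finding intermediary languages by shortest path, the sketch below runs Dijkstra's algorithm over a small weighted graph in which edge weights stand for lexical distance. The language names and distances are placeholders invented for the example, and Yen's K algorithm (which the paper implements for the first criterion) is not shown.

```python
import heapq

def dijkstra(graph, source, target):
    """Return (path, total cost) of the cheapest path, or (None, inf) if unreachable."""
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue  # stale heap entry
        visited.add(node)
        if node == target:
            break
        for neighbour, weight in graph.get(node, {}).items():
            new_dist = d + weight
            if new_dist < dist.get(neighbour, float("inf")):
                dist[neighbour] = new_dist
                prev[neighbour] = node
                heapq.heappush(heap, (new_dist, neighbour))
    if target not in dist:
        return None, float("inf")
    # Walk back from the target to recover the chain of intermediary languages.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return path, dist[target]

# Hypothetical lexical distances between placeholder languages.
GRAPH = {
    "lang_a": {"lang_b": 3, "lang_c": 9},
    "lang_b": {"lang_a": 3, "lang_c": 4, "lang_d": 8},
    "lang_c": {"lang_a": 9, "lang_b": 4, "lang_d": 2},
    "lang_d": {"lang_b": 8, "lang_c": 2},
}

path, cost = dijkstra(GRAPH, "lang_a", "lang_d")
```

    Here the cheapest route from `lang_a` to `lang_d` passes through `lang_b` and `lang_c`, which would be the candidate intermediary languages in this toy graph.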